Automatic Transcription of Spontaneous Lecture Speech

Authors

  • Tatsuya Kawahara
  • Hiroaki Nanjo
  • Sadaoki Furui
Abstract

We introduce our extensive projects on spontaneous speech processing and our current trials of lecture speech recognition. A large corpus of lecture presentations and talks is being collected in the project. We have trained initial baseline models and confirmed a significant difference between real lectures and written notes. In spontaneous lecture speech, the speaking rate is generally faster and varies widely, which makes it harder to apply fixed segmentation and decoding settings. We therefore propose sequential decoding and speaking-rate-dependent decoding strategies. The sequential decoder performs automatic segmentation and decoding of input utterances simultaneously. The most suitable acoustic analysis, phone models, and decoding parameters are then applied according to the current speaking rate. These strategies improve automatic transcription of real lecture speech.
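The two strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how segments are found or how the speaking rate is measured, so the energy-based pause detector, the units-per-second rate measure, the thresholds, and every name and value below (`DecodingConfig`, `split_on_pauses`, `select_config`, the frame shifts and penalties) are assumptions chosen only to show the control flow of segmenting the input and then picking rate-specific settings per segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodingConfig:
    """Hypothetical bundle of rate-dependent settings (all values illustrative)."""
    name: str
    frame_shift_ms: float     # acoustic analysis step
    phone_model: str          # label of the phone model set to load
    insertion_penalty: float  # decoding parameter


SLOW = DecodingConfig("slow", 10.0, "phones_slow", -2.0)
NORMAL = DecodingConfig("normal", 10.0, "phones_normal", 0.0)
FAST = DecodingConfig("fast", 8.0, "phones_fast", 2.0)


def split_on_pauses(frame_energies, energy_threshold, min_pause_frames):
    """Return (start, end) frame ranges of speech, end exclusive.

    A segment is closed once at least `min_pause_frames` consecutive
    low-energy frames have been seen (an assumed pause criterion).
    """
    segments = []
    start = None
    silence_run = 0
    for i, energy in enumerate(frame_energies):
        if energy >= energy_threshold:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_pause_frames:
                segments.append((start, i - silence_run + 1))
                start = None
                silence_run = 0
    if start is not None:
        # Input ended inside a segment; drop any trailing low-energy frames.
        segments.append((start, len(frame_energies) - silence_run))
    return segments


def estimate_rate(num_units, duration_s):
    """Speaking rate as units (e.g. phones) per second; the unit is an assumption."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return num_units / duration_s


def select_config(rate, slow_max=8.0, fast_min=12.0):
    """Pick settings for the current rate; the thresholds are hypothetical."""
    if rate < slow_max:
        return SLOW
    if rate > fast_min:
        return FAST
    return NORMAL


def plan_decoding(frame_energies, units_per_segment, frame_shift_s=0.01,
                  energy_threshold=1.0, min_pause_frames=3):
    """Segment the input, then choose a configuration for each segment.

    `units_per_segment` stands in for a unit count that a real system
    might obtain from a first recognition pass.
    """
    segments = split_on_pauses(frame_energies, energy_threshold, min_pause_frames)
    plan = []
    for (start, end), units in zip(segments, units_per_segment):
        rate = estimate_rate(units, (end - start) * frame_shift_s)
        plan.append(((start, end), select_config(rate)))
    return plan
```

In this sketch segmentation and configuration selection run as separate passes for readability, whereas the abstract describes a decoder that segments and decodes simultaneously.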

Similar resources

Automatic Speech Transcription and Archiving System using the Corpus of Spontaneous Japanese

The target of automatic speech recognition (ASR) research has shifted from read speech to spontaneous speech. The technology will enable automatic transcription (and translation) of lectures and meetings. In Japan, the “Spontaneous Speech” project has been conducted over the last five years, and we set up the huge “Corpus of Spontaneous Japanese (CSJ)”, which consists of over 2000 speeches (500 hou...

Efficient Access to Lecture Audio Archives through Spoken Language Processing

The paper first addresses the current state of speech recognition using the “Corpus of Spontaneous Japanese (CSJ)”. It is shown that the large-scale corpus had a strong impact on training acoustic and language models that consider the morphological and pronunciation variations characteristic of spontaneous Japanese. Unsupervised adaptation of these models and the speaking rate is also effec...

Filled Pause Refinement Based on the Pronunciation Probability for Lecture Speech

Although automatic speech recognition has become quite proficient at recognizing or transcribing well-prepared fluent speech, the transcription of speech that contains many disfluencies, such as spontaneous conversational and lecture speech, remains problematic. Filled pauses (FPs) are the most frequently occurring disfluencies in this type of speech. Most recent studies have shown tha...

Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts

For language modeling in spontaneous speech recognition, we propose a style transformation approach, which transforms written texts into a spoken-style language model. Since these two styles differ greatly and direct transformation is therefore difficult, we cascade two transformation methods: rule-based transformation to rewrite written-style texts to intermediate “verbatim” texts, and statist...

Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition

The paper addresses large-vocabulary spontaneous speech recognition, focusing on acoustic modeling that considers the speaking rate. Using the real lecture speech corpus collected under the priority research project in Japan, we built a baseline acoustic model and evaluated it on the automatic transcription of oral presentations by experienced speakers, obtaining a word accuracy of 58.2%. Compar...

Publication date: 2001